Effective dimensionality

Formally, we can define the effective dimensionality of a dataset in terms of the eigenvalues of its covariance matrix.

With the pseudoprobability defined by the eigenvalues $\lambda_1, \dots, \lambda_K$ of the covariance matrix,

$p_k = \frac{\lambda_k}{\sum_{k'=1}^{K}\lambda_{k'}}$

we can calculate the Shannon entropy

$H(\mathbf{p}) = -\sum_{k=1}^{K}p_k \log(p_k)$

then the effective dimensionality $d$ can be calculated by

$d = \exp \left[ H(\mathbf{p}) \right]$

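which follows from requiring that a uniform distribution over $d$ dimensions have the same entropy as $\mathbf{p}$ (the left-hand side below simplifies to $\log d$):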

$- \sum_{k=1}^{d} \frac{1}{d} \log \left( \frac{1}{d} \right) = H(\mathbf{p})$
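
The calculation above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name `effective_dimensionality`, the assumption that the data matrix `X` has shape `(n_samples, n_features)`, and the convention $0 \log 0 = 0$ for zero eigenvalues are choices made here, not part of the definition above.

```python
import numpy as np


def effective_dimensionality(X):
    """Entropy-based effective dimensionality of X, shape (n_samples, n_features)."""
    # Covariance matrix of the features (columns of X); force 2-D for one feature.
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    # eigvalsh handles symmetric matrices; clip tiny negative round-off values.
    eigvals = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    # Normalize the eigenvalues into the pseudoprobabilities p_k.
    p = eigvals / eigvals.sum()
    # Drop zero entries (assumed convention: 0 * log 0 = 0).
    p = p[p > 0]
    entropy = -np.sum(p * np.log(p))
    return np.exp(entropy)


rng = np.random.default_rng(0)
# Isotropic Gaussian data in 5 dimensions: d should be close to 5.
X_iso = rng.normal(size=(10_000, 5))
# The same scalar copied into 5 columns, so the data lie on a line: d should be close to 1.
X_line = rng.normal(size=(10_000, 1)) * np.ones((1, 5))

print(effective_dimensionality(X_iso))
print(effective_dimensionality(X_line))
```

Since $H(\mathbf{p}) \le \log K$, the result never exceeds the number of features $K$, and it is generally not an integer.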